Simulated annealing for mathematical problems

We now want to focus on the features of simulated annealing which make it a successful method of extremization for many mathematical problems [2]. Consider an arbitrary function of several variables. If the function is simple enough and depends only on a small number of variables, there are very efficient exact numerical methods to find its local minima. They are all based on essentially the same principle: pick initial values for each variable and compute the function, then change the variables slightly in such a way that the function value decreases. Repetition allows one to come arbitrarily close to a local minimum. Different algorithms have been invented to optimize the progress at each iteration for different classes of functions.
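To make this principle concrete, the following sketch (in Python, written for this text rather than taken from [2]) implements a crude version of such an iterative improvement scheme: it tries small random changes of a single variable and keeps only those which lower the function value. Real algorithms choose the downhill direction more cleverly, e.g. by steepest descent; the step size and iteration count here are arbitrary illustrative choices.

```python
import random

def greedy_descent(f, x0, step=0.1, iterations=10_000):
    """Iterative improvement: accept a small random change only if f decreases."""
    x = x0
    for _ in range(iterations):
        x_new = x + random.uniform(-step, step)   # change the variable slightly
        if f(x_new) < f(x):                       # keep the change only if it goes downhill
            x = x_new
    return x

# A function with a single minimum: the scheme converges to it from any start.
print(greedy_descent(lambda x: x * x, x0=3.0))    # approximately 0
```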

Whenever one of these methods is applicable, e.g. to find the minimum of f(x) = x², it usually works much better than simulated annealing, but there are several types of problems for which there is virtually no alternative to simulated annealing. As it turns out, these exact algorithms are 'greedy' by design: given a starting point, they walk downhill (following the steepest descent) until they have found a local minimum. But if one is looking for a global minimum among many local ones, and if one does not know beforehand where to look for it, such a method will in general never find it. Simulated annealing does much better at finding at least an approximate global minimum since it allows uphill moves. Another area where exact numerical methods fail is when the function depends on a large number of variables. Finally, simulated annealing works equally well for non-differentiable (even discontinuous) functions, while exact methods often require the function to be differentiable.
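The failure mode described above is easy to reproduce with the same greedy scheme. In the sketch below (again illustrative Python, with an arbitrarily chosen test function), the run started at x = 5 gets stuck in a local minimum near x ≈ 3.8, while the run started at x = -2 happens to lie in the right basin and finds the global minimum near x ≈ -1.3.

```python
import math
import random

def greedy_descent(f, x0, step=0.1, iterations=10_000):
    # Iterative improvement as before: only downhill moves are accepted.
    x = x0
    for _ in range(iterations):
        x_new = x + random.uniform(-step, step)
        if f(x_new) < f(x):
            x = x_new
    return x

# A function with several local minima (chosen purely for illustration).
f = lambda x: x * x + 10.0 * math.sin(x)

print(greedy_descent(f, x0=5.0))    # trapped in the local minimum near x ~ 3.8
print(greedy_descent(f, x0=-2.0))   # finds the global minimum near x ~ -1.3
```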

Perhaps the reader feels that already far too much has been said about the physical and mathematical background of simulated annealing. After all, the prescription for simulated annealing to find the minimum of a function of many variables, distilled from the above, is simple. We are given a space of configurations described by some variables, and a function which assigns a number to each configuration. To approximate the global minimum we perform small random moves in configuration space. If the function decreases, we accept the move. If the function increases, we accept the move with a probability which decreases exponentially with the increase in the function and with the inverse of the temperature. The temperature is decreased according to an annealing schedule which should leave enough time for proper thermalization. The reason I gave so much background is that even though the procedure of simulated annealing may look trivial, it should not be underestimated.
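This prescription translates almost line by line into code. The following minimal sketch (Python, written here for illustration) uses the Metropolis acceptance rule together with a simple geometric cooling schedule; the starting temperature, cooling factor, step size, and number of moves per temperature are assumptions chosen for this toy example, not values from the references.

```python
import math
import random

def simulated_annealing(f, x0, t_start=10.0, t_end=1e-3, cooling=0.99,
                        moves_per_t=200, step=0.5):
    """Approximate the global minimum of f by annealing a single variable."""
    x, t = x0, t_start
    while t > t_end:
        for _ in range(moves_per_t):                  # leave time to thermalize at each t
            x_new = x + random.uniform(-step, step)   # small random move
            delta = f(x_new) - f(x)
            # Accept downhill moves always; accept uphill moves with probability
            # exp(-delta / t), which shrinks with delta and with 1/t.
            if delta < 0 or random.random() < math.exp(-delta / t):
                x = x_new
        t *= cooling                                  # annealing schedule: lower t slowly
    return x

# The multimodal test function from before: annealing escapes the local minima
# and typically ends up close to the global minimum near x ~ -1.3.
f = lambda x: x * x + 10.0 * math.sin(x)
print(simulated_annealing(f, x0=5.0))
```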

We conclude this section with two examples from combinatorial optimization, which is in many ways similar to our goal of analyzing game trees. Simulated annealing has, as far as practical applications are concerned, solved the famous 'traveling salesman problem' [2,3]. The problem is to find the shortest round trip that visits each of N cities exactly once. The same type of problem arises when one wants to optimize the layout of integrated circuits, or in a telephone network where there are several possible locations for the switches. This problem is an example of combinatorial optimization, since the configuration space on which a function (the length of the tour) is to be minimized is a discrete, factorially large space. In the traveling salesman problem, there are N! = N·(N−1)·…·1 possibilities to make an ordered list of N cities. The computation time to find the exact solution by exhaustive search therefore grows factorially with N; the traveling salesman problem is NP-complete, and no exact algorithm with polynomial running time is known. Simulated annealing achieves a good approximation to the optimal solution in a computing time which grows only as a small power of N. The procedure starts with an arbitrary tour. Two types of local changes are performed: one move reverses the order of several cities which are next to each other on the tour, and the other removes several neighboring cities from the list and reinserts them at a different position in the list.
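A sketch of how these two moves can be combined with the annealing procedure is given below (Python, written for illustration; the random cities, the cooling schedule, and all parameter values are assumptions and are not taken from [2,3]).

```python
import math
import random

def tour_length(cities, tour):
    """Length of the closed round trip visiting the cities in the given order."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def propose_move(tour):
    """Apply one of the two local changes described in the text."""
    new = tour[:]
    i, j = sorted(random.sample(range(len(tour)), 2))
    if random.random() < 0.5:
        new[i:j + 1] = reversed(new[i:j + 1])   # reverse a stretch of neighboring cities
    else:
        segment = new[i:j + 1]                  # cut out a stretch of neighboring cities...
        del new[i:j + 1]
        k = random.randrange(len(new) + 1)
        new[k:k] = segment                      # ...and reinsert it at another position
    return new

def anneal_tsp(cities, t_start=1.0, t_end=1e-3, cooling=0.995, moves_per_t=100):
    tour = list(range(len(cities)))
    random.shuffle(tour)                        # start with an arbitrary tour
    length, t = tour_length(cities, tour), t_start
    while t > t_end:
        for _ in range(moves_per_t):
            candidate = propose_move(tour)
            delta = tour_length(cities, candidate) - length
            if delta < 0 or random.random() < math.exp(-delta / t):
                tour, length = candidate, length + delta
        t *= cooling
    return tour, length

cities = [(random.random(), random.random()) for _ in range(30)]
tour, length = anneal_tsp(cities)
print(length)
```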

For a second example see [4], where the problem is to arrange 25 playing cards in a 5×5 tableau such that the total value of the rows, columns, and diagonals, interpreted as poker hands, is maximized. Again, simulated annealing is successful, and much faster and simpler than other methods.